About the impossibility to prove P ≠ NP or
P = NP and the pseudo-randomness in NP

Prof. Marcel Remon 

Abstract 

The relationship between the complexity classes P and NP is an unsolved question in the
field of theoretical computer science. In this paper, we look at the link between the P - NP
question and the "Deterministic" versus "Non Deterministic" nature of a problem, and more
specifically at the temporal nature of the complexity within the NP class of problems. Recall
that the NP class is called the class of "Non Deterministic Polynomial" languages.



Using the meta argument that results in Mathematics should be "time independent" as they
are reproducible, the paper shows that the P ≠ NP assertion is impossible to prove in the
a-temporal framework of Mathematics. A similar argument based on randomness shows that
the P = NP assertion is also impossible to prove, so that the P - NP problem turns out
to be "unprovable" in Mathematics. This is not an undecidability theorem, as undecidability
points to the paradoxical nature of a proposition. In fact, this paper highlights the time
dependence of the complexity for any NP problem, linked to some pseudo-randomness in
its heart.

Index Terms 

Algorithm Complexity, Non Deterministic Languages, P - NP problem, 3-CNF-SAT
problem

I. Introduction 

A. The class P of languages 

A decision problem is a problem that takes as input some string, and outputs "yes" or "no".
If there is an algorithm (say a Turing machine, or a computer program with unbounded
memory) which is able to produce the correct answer for any input string of length $n$ in at
most $c\,n^k$ steps, where $k$ and $c$ are constants independent of the input string, then we say
that the problem can be solved in polynomial time and we place it in the class P.

M. Remon, Department of Mathematics, Namur University, Belgium; marcel.remon@fundp.ac.be

More formally, P is defined as the set of all languages which can be decided by a deterministic
polynomial-time Turing machine. Here we follow the framework proposed by Stephen Cook [1].
Let $\Sigma$ be a finite alphabet with at least two elements, and let $\Sigma^*$ be the set of finite strings
over $\Sigma$. Then a language over $\Sigma$ is a subset $L$ of $\Sigma^*$. Each Turing Machine $M$ has an
associated input alphabet $\Sigma$. For each string $w$ in $\Sigma^*$, there is a computation associated
with $M$, with input $w$. We say that $M$ accepts $w$ if this computation terminates in the
accepting state "Yes". Note that $M$ fails to accept $w$ either if this computation ends in the
rejecting state "No", or if the computation fails to terminate.

The language accepted by $M$, denoted $L(M)$, has associated alphabet $\Sigma$ and is defined by

$$L(M) = \{w \in \Sigma^* \mid M \text{ accepts } w\}$$

We denote by $t_M(w)$ the number of steps in the computation of $M$ on input $w$. If this
computation never halts, then $t_M(w) = \infty$. For $n \in \mathbb{N}$, we denote by $T_M(n)$ the worst case
run time of $M$; that is

$$T_M(n) = \max\{t_M(w) \mid w \in \Sigma^n\}$$

where $\Sigma^n$ is the set of all strings over $\Sigma$ of length $n$. We say that $M$ runs in polynomial time
if:

$$\exists k \in \mathbb{N} \text{ such that } \forall n : T_M(n) \le n^k + k$$

Definition I.1: We define the class P of languages by

$$P = \{L \mid L = L(M) \text{ for a machine } M \text{ which runs in polynomial time}\}$$

B. The class NP of languages 

The notation NP stands for non deterministic polynomial time, since originally NP was
defined in terms of non deterministic machines. However, it is customary to give an
equivalent definition using the notion of a checking relation, which is simply a binary relation
$R \subseteq \Sigma^* \times \Sigma_1^*$ for some finite alphabets $\Sigma$ and $\Sigma_1$. We associate with each such relation $R$ a
language $L_R$ over $\Sigma \cup \Sigma_1 \cup \{\#\}$ defined by

$$L_R = \{w\#y \mid R(w,y)\}$$

where the symbol # is not in $\Sigma$. We say that $R$ is polynomial-time iff $L_R \in P$.

Definition I.2: We define the class NP of languages by the condition that a language
$L$ over $\Sigma$ is in NP iff there is $k \in \mathbb{N}$ and a polynomial-time checking relation $R$ such that
for all $w \in \Sigma^*$,

$$w \in L \iff \exists y\,(|y| \le |w|^k \text{ and } R(w,y))$$

where $|w|$ and $|y|$ denote the lengths of $w$ and $y$, respectively. We say that $y$ is a certificate
associated to $w$.

C. The P - NP question 

The "P versus NP problem", i.e. the question whether P = NP or P ≠ NP, is an open
question and is the core of this paper. See [4] for the history of the question. Here, we
show that neither P = NP nor P ≠ NP can be proved in the "a-temporal" framework
of Mathematics where results should always be reproducible. We link this assertion to the
existence of some pseudo-random part in the heart of any NP problem.

D. An example of NP problem : the 3-CNF-satisfiability problem 

Boolean formulae are built in the usual way from propositional variables $x_i$ and the logical
connectives $\wedge$, $\vee$ and $\neg$, which are interpreted as conjunction, disjunction, and negation,
respectively. A literal is a propositional variable or the negation of a propositional variable,
and a clause is a disjunction of literals. A Boolean formula is in conjunctive normal form iff
it is a conjunction of clauses.

A 3-CNF formula is a Boolean formula in conjunctive normal form with exactly three
literals per clause, like $\varphi := (x_1 \vee x_2 \vee \neg x_3) \wedge (\neg x_2 \vee x_3 \vee \neg x_4) := \psi_1 \wedge \psi_2$. The
3-CNF-satisfiability or 3-CNF-SAT problem is to decide whether or not there exist logical values
for the variables such that $\varphi$ is true (in the previous example, $\varphi = 1$ (True) if $x_1 = \neg x_2 = 1$).

Until now, nobody knows whether or not it is possible to check the satisfiability of any given
3-CNF formula $\varphi$ in polynomial time, as the 3-CNF-SAT problem is known to be NP-complete
(and in particular to belong to the class NP of problems). See [2] for details.
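As a minimal sketch of the decision problem just described, the following Python snippet checks the example formula by brute force over all $2^n$ truth assignments. The integer encoding of literals (`k` for $x_k$, `-k` for $\neg x_k$) and the function names are illustrative assumptions, not notation from the paper.

```python
from itertools import product

# Assumed encoding: literal k > 0 stands for x_k, k < 0 for "not x_k".
PHI = [(1, 2, -3), (-2, 3, -4)]  # the example formula psi_1 AND psi_2


def evaluate(formula, assignment):
    """Return True if every clause contains at least one true literal.

    `assignment` maps a variable index k to the truth value of x_k.
    """
    return all(
        any(assignment[abs(lit)] == (lit > 0) for lit in clause)
        for clause in formula
    )


def is_satisfiable(formula, n):
    """Brute force: try all 2**n truth assignments of x_1..x_n."""
    for values in product([False, True], repeat=n):
        assignment = dict(enumerate(values, start=1))
        if evaluate(formula, assignment):
            return True
    return False
```

With this encoding, any assignment with $x_1 = 1$ and $x_2 = 0$ satisfies `PHI`, in line with the remark in the text.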

Let us give some general properties of the 3-CNF formulae.

The size $s$ of a 3-CNF formula $\varphi$ is defined as the size of the corresponding Boolean circuit,
i.e. the number of logical connectives in $\varphi$. Let us note the following property of the size $s$:

$$s = O(m) = O(n^3) \qquad (1)$$

where $n$ is the number of propositional variables $x_i$ and $m$ the number of clauses in $\varphi$.
Indeed,

$$\frac{n}{3} \le m \le 2^3\,\frac{n(n-1)(n-2)}{6} \quad \text{and} \quad (3m - 1) \le s \le (6m - 1)$$

as there is a maximum of $2^3 \times C_n^3$ possible clauses, which corresponds to the choice of 3
different variables among $n$, each of them being in an affirmative or negative state. Note
that $s = 3m - 1$ when there is no "$\neg$" in $\varphi$ [$m \times 2$ logical connectives "$\vee$" for the $\psi_i$ and
$m - 1$ "$\wedge$" as conjunctions] and $s = 6m - 1$ when all the literals in $\varphi$ are in a negative form.

In this paper, we define the dimension of a 3-CNF formula as $(n,m)$. And we represent
any 3-CNF formula by a matrix $A$ of size $2n \times m$ ($2n$ columns, one per literal, and $m$ rows,
one per clause). The signature $\omega_i$ of a clause $\psi_i$ is defined
as the value of the binary number corresponding to the row in the matrix. The signature
of a formula is the ordered vector of these clauses' signatures: $\varphi_{n,m} \sim (\omega_1, \omega_2, \cdots, \omega_m)$ with
$21 \le \omega_i \le 21 \cdot 2^{2n-5}$ and $\omega_i > \omega_j$ for $i < j$. See Table I.



3-CNF formula $\varphi$ (dimension d = (4,3))

                                  x1  ¬x1 | x2  ¬x2 | x3  ¬x3 | x4  ¬x4
    ψ1 : (x1 ∨ x2 ∨ ¬x3)           1   0  |  1   0  |  0   1  |  0   0
  ∧ ψ2 : (¬x2 ∨ x3 ∨ ¬x4)          0   0  |  0   1  |  1   0  |  0   1
  ∧ ψ3 : (¬x1 ∨ ¬x3 ∨ x4)          0   1  |  0   0  |  0   1  |  1   0

TABLE I

Example of matrix representation and signatures of a 3-CNF formula.
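The signature definition can be sketched in a few lines of Python. This is only an illustration under stated assumptions: literals are encoded as signed integers (`k` for $x_k$, `-k` for $\neg x_k$), the matrix columns are ordered $x_1, \neg x_1, \ldots, x_n, \neg x_n$, and the leftmost column is taken as the most significant bit, which is the convention consistent with the bounds $21 \le \omega_i \le 21 \cdot 2^{2n-5}$ given above. The function names are hypothetical.

```python
def clause_signature(clause, n):
    """Value of the 2n-bit row of the clause, leftmost column = highest bit.

    Column order is x_1, not x_1, x_2, not x_2, ..., x_n, not x_n.
    """
    signature = 0
    for lit in clause:
        k = abs(lit)
        # x_k sits at bit position 2n - 2k + 1, "not x_k" at position 2n - 2k.
        signature += 2 ** (2 * n - 2 * k + (1 if lit > 0 else 0))
    return signature


def formula_signature(formula, n):
    """Ordered (decreasing) vector of the clause signatures."""
    return tuple(sorted((clause_signature(c, n) for c in formula), reverse=True))


# The three clauses of Table I, with n = 4 variables.
TABLE_I = [(1, 2, -3), (-2, 3, -4), (-1, -3, 4)]
```

Sorting the clause signatures in decreasing order gives a canonical form, so two formulae made of the same clauses written in a different order receive the same signature vector.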
There are $2^3 \times C_n^3$ possible clauses with $n$ variables. A 3-CNF formula with dimension $(k,m)$
with $k \le n$ is composed of $m$ different clauses drawn from the $2^3 \times C_n^3$ possible clauses. So,
the total number of such formulae is

$$C_{2^3 \times C_n^3}^{m} = \frac{(2^3 \times C_n^3)!}{m! \times (2^3 \times C_n^3 - m)!} = O(n^{3m}) \qquad (2)$$

Let $\Phi_{n,m}$ denote the set of all these formulae:

$$\Phi_{n,m} = \{\varphi : \varphi \text{ is a 3-CNF formula of dimension } (k,m) \text{ with } k \le n\}$$

The 3-CNF-Satisfiability problem is to find a function $S$:

$$S : \Phi_{n,m} \to \{0,1\} : \varphi \mapsto 0 \text{ if } \varphi \text{ is non satisfiable and } 1 \text{ otherwise} \qquad (3)$$

The 3-CNF-Satisfiability problem is known to belong to the NP class.
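To get a feeling for the sizes involved in equation (2), the counts can be evaluated directly. The helper names below are illustrative assumptions; only the two formulas themselves come from the text.

```python
from math import comb


def num_clauses(n):
    """Number of possible clauses on n variables: 2^3 * C(n, 3)."""
    return 8 * comb(n, 3)


def num_formulae(n, m):
    """Number of formulae made of m distinct clauses, as in equation (2)."""
    return comb(num_clauses(n), m)
```

For $n = 4$ there are 32 possible clauses, so the number of formulae with $m$ clauses is $C_{32}^{m}$, which grows quickly with $m$ up to $m = 16$.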



II. A "Meta Mathematical" proof that P ≠ NP is impossible to prove

One way to prove that P ≠ NP is to show that the complexity measure $T_M(n)$ for some
NP problem, like the 3-CNF-SAT problem, cannot be reduced to a polynomial time. We
will show that the 3-CNF-SAT problem behaves as a common safe problem and that its
complexity is time dependent. In fact, at some specific time $t_0 + \Delta t$, the 3-CNF-SAT problem
will be of polynomial complexity. So, P ≠ NP will not be provable, as $T_M(n)$ is not "always"
supra-polynomial.

A. The analogy with the safe problem and the time dependent nature of complexity 

Finding whether or not a given 3-CNF formula $\varphi$ is satisfiable is like being in front of a safe,
trying to find the opening combination. One has to try every possible value (0 or 1) for the
variables $x_i$ in $\varphi$ to see whether some combination satisfies $\varphi$, in the same way as one tries
every combination to get the one, if it exists, that opens the safe.



Let us consider more deeply the analogy between the 3-CNF-SAT problem and the safe
problem, especially by looking at the time dependent nature of the complexity involved here.
It is clear that when you are in front of a safe for the first time, it is a very hard problem,
as you do not have any information about the correct opening combination. In fact, in the
worst case, it takes an exponential time to find it. But as soon as you have succeeded in
opening the safe (or in finding that there is no solution), the problem becomes trivial. It
takes only one operation to open the safe or to declare it impossible to open.

Let us denote by $t_0$ the first time you try to open the safe, and by $\Delta t$ the time needed to
find the solution. Let us remark that $\Delta t$ can be huge but it is always finite, as the number
of possible combinations is finite. Now we compute the complexity measure $T_{safe}(n)$ for the
safe problem at $t_0$ and $t_0 + \Delta t$.

In $t_0$, one has to test all possible combinations. If the safe has $n$ buttons with only two
positions (0 or 1), there will be $2^n$ possibilities. Because no information is available about
the solution, there is no way to reduce the number of cases to be tested. The exponential
complexity of the problem comes from the total lack of information about the solution. This
absence of information is strictly related to the random nature of the problem: the finding
of the opening combination is a random search process for anyone in front of the safe, at
least in $t_0$. So, we get

$$T_{safe,\,t_0}(n) = 2^n$$

But after $\Delta t$, the correct opening combination is known forever, and the complexity measure
is now

$$T_{safe,\,t_0+\Delta t}(n) = 1$$

As one can see, the complexity measure $T_{safe}(n)$ for the safe problem is time dependent.
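The safe analogy can be mimicked with a small Python sketch. It is only an illustration under assumptions: the safe is modelled as a hypothetical function that answers yes or no to a proposed combination, and the knowledge acquired by time $t_0 + \Delta t$ is modelled as a dictionary called `memory`. The number of trials returned plays the role of $T_{safe}(n)$.

```python
from itertools import product


def make_safe(secret):
    """Return a hypothetical safe: a function telling whether a combination opens it."""
    secret = tuple(secret)
    return lambda combo: tuple(combo) == secret


def open_safe(safe, n, memory):
    """Return (combination, number_of_trials).

    `memory` stands for what is known at the time of the call: empty at t0,
    filled once the exhaustive search has been completed.
    """
    if "combo" in memory:
        # Time t0 + Delta t: one operation, the known answer is replayed.
        return memory["combo"], 1
    trials = 0
    for combo in product((0, 1), repeat=n):  # time t0: up to 2**n trials
        trials += 1
        if safe(combo):
            memory["combo"] = combo
            return combo, trials
    memory["combo"] = None  # no combination opens the safe
    return None, trials
```

Calling `open_safe` twice on the same safe with the same `memory` shows the two regimes: the first call may need all $2^n$ trials, the second one needs a single operation.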

The same occurs for the 3-CNF-SAT problem as well as for any NP problem. Their
complexity measure changes in time. The idea of this section about the impossibility to prove
P ≠ NP is to show that, even if $T_{3\text{-CNF-SAT},\,t_0}(n)$ is not known (exponential or polynomial?),
there exists some $\Delta t$, even huge, such that the complexity measure is polynomial in
$t_0 + \Delta t$.
B. The Computation of $T_{3\text{-CNF-SAT},\,t_0+\Delta t}(n)$

Let us take $\Delta t$ large enough so that $S$ [the 3-CNF-SAT decision function, see equation (3)]
is known for all the 3-CNF formulae in $\Phi_{n,m}$. $\Delta t$ exists and is finite. In the analogy with the
safe problem, it corresponds to the time needed to find the solution for all safes
of dimension $n$. Until now, we do not know whether $S$ can be computed in polynomial time
or not, but this only changes the size of $\Delta t$.

The output of $S$ is the set $\bar{\mathcal{S}}_{n,m}$ of all satisfiable 3-CNF formulae of $\Phi_{n,m}$, or equivalently
$\mathcal{S}_{n,m}$, the set of all non satisfiable 3-CNF formulae. As equation (2) shows,
$\mathcal{S}_{n,m}$ contains at most $O(n^{3m})$ elements. The worst case occurs when $m = (2^3 \times C_n^3)/2 =
O(n^3)$. As $\mathcal{S}_{n,m} \subset \Phi_{n,m}$, the equation (2) gives us the following result:



#{5„,™} = 0(n3("')) ^ #{5„,™} = 0(2"') asn3>2 (4) 



See Figure 1 for an example of $\#\{\Phi_{n,m}\}$ and $\#\{\mathcal{S}_{n,m}\}$ with $n = 4$. The figure shows that
$\#\{\Phi_{n,m}\}$ and $\#\{\mathcal{S}_{n,m}\}$ behave similarly.

So, one can now calculate $T_{3\text{-CNF-SAT},\,t_0+\Delta t}(n)$: it is the time required to check whether or not a
specific 3-CNF formula belongs to $\mathcal{S}_{n,m}$, after $\Delta t$ large enough for the entire set $\mathcal{S}_{n,m}$
to be computed. If one can allocate an exponential space for memory to save the elements
of $\mathcal{S}_{n,m}$ (as accepted in Turing machines), then a hash algorithm, based on the clauses'
signatures, can be used to see whether or not a 3-CNF formula $\varphi$ belongs to the set $\mathcal{S}_{n,m}$.
For instance, one can use $\omega_i$, the $i$-th ordered signature of clauses, as the $i$-th successive hash
function $h_i(\varphi)$. It takes $O(2n)$ operations to compute each of these $m$ clauses' signatures of
$\varphi$ and $O(m \log m)$ computations to sort them. We need then $O(2^3 \times C_n^3)$ operations, which
corresponds to the maximum number of possible values for the signatures, to find whether
the signature belongs or not to the corresponding section of $\mathcal{S}_{n,m}$, where the formulae are
also ordered, in a lexical ordering, following their clauses' signatures. Using equation (1)
[i.e. $O(m) = O(n^3)$],

$$T_{3\text{-CNF-SAT},\,t_0+\Delta t}(n) = O\big(m(2n) + (m \log m) + m(2^3 C_n^3)\big) = O(m^2) = O(n^k) \text{ for some } k \in \mathbb{N} \qquad (5)$$
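The two phases of this argument can be sketched in Python for a toy case. This is a hedged illustration, not the author's procedure: Python's built-in hashing of `frozenset` objects is used in place of the signature-based hash functions $h_i$ described above, literals are encoded as signed integers (an assumption), and the table is only built for very small $n$ because its construction is exponential.

```python
from itertools import combinations, product


def all_clauses(n):
    """All 2^3 * C(n, 3) clauses on three distinct variables among x_1..x_n."""
    return [
        tuple(sign * var for sign, var in zip(signs, variables))
        for variables in combinations(range(1, n + 1), 3)
        for signs in product((1, -1), repeat=3)
    ]


def is_satisfiable(formula, n):
    """Brute-force satisfiability test over the 2**n assignments."""
    for values in product((False, True), repeat=n):
        if all(any(values[abs(l) - 1] == (l > 0) for l in c) for c in formula):
            return True
    return False


def build_unsat_table(n, m):
    """Phase t0: exhaustive (exponential) construction of the non satisfiable set."""
    return {
        frozenset(f)
        for f in combinations(all_clauses(n), m)
        if not is_satisfiable(f, n)
    }


def decide(formula, table):
    """Phase t0 + Delta t: the decision function S reduced to a table lookup."""
    return 0 if frozenset(formula) in table else 1
```

For $n = 3$ the only non satisfiable formula is the one containing all 8 possible clauses, so `build_unsat_table(3, 8)` holds a single element and every later query is answered by one lookup.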
Fig. 1

Logarithmic scale: the upper curve represents the total number of all possible 3-CNF formulae in
$\Phi_{4,m}$; the second one, the total of non-satisfiable 3-CNF formulae, i.e. $\#\{\mathcal{S}_{4,m}\}$; and the lower one,
the total of Irreducible Non-Satisfiable 3-CNF formulae, i.e. $\#\{\mathcal{S}^{INS}_{4,m}\}$ (see section III.B for the definition).



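C. The "unprovability" of P ≠ NP

Theorem II.1: It is impossible to prove that P ≠ NP in the deterministic or time
independent framework of Mathematics.

Proof: The solution of the 3-CNF-SAT problem is equivalent to the setting of these two
functions $S'$ and $S''$:

$$(\text{In } t_0) \quad S' : \Phi_{n,m} \to \{0,1\} \quad (\text{the construction of } \mathcal{S}_{n,m}) : \quad \varphi \mapsto 0 \text{ if } \varphi \in \mathcal{S}_{n,m} \text{ and } 1 \text{ otherwise} \qquad (6)$$

$$(\text{In } t_0 + \Delta t) \quad S'' : \Phi_{n,m} \to \{0,1\} \quad (\varphi \in \mathcal{S}_{n,m} \text{ when } \mathcal{S}_{n,m} \text{ is known}) : \quad \varphi \mapsto 0 \text{ if } \varphi \in \mathcal{S}_{n,m} \text{ and } 1 \text{ otherwise} \qquad (7)$$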


The meta mathematical argument lies in the fact that any operation done by $S'$ in $t_0$ can be
reduced to a polynomial time operation by $S''$ in $t_0 + \Delta t$ ¹.

¹ To make it easier to understand, let us think of the version of 3-CNF-SAT with $n = 4$: it took us several months
to build $\mathcal{S}_{n,m}$; but now it only takes seconds to solve the 3-CNF-SAT problem with 4 variables. And this is done
forever. A similar reasoning can be done for the $i$-th decimal of $\pi$, or for the list of the $n$ first prime numbers.

Mathematically speaking, it is impossible to make a formal or mathematical distinction
between both functions $S'$ and $S''$, as time does not interfere with proofs in mathematics.
More precisely, if someone proves that the 3-CNF-SAT problem $S$ (or $S'$) is non polynomial,
this assertion, as well as the steps for the demonstration, should be true at any time,
independently of $t$, even in $t_0 + \Delta t$. The proof could not introduce time in the demonstration.
But people will only be able to prove the non polynomial nature of 3-CNF-SAT for time $t_0$,
certainly not for time $t_0 + \Delta t$ as shown in equation (5). And this argument holds for all
NP problems because all of them are equivalent, in terms of complexity, to the 3-CNF-SAT
problem. ■

This is exactly the same situation as with the safe problem: the complexity measure of the
problem is changing over time, becoming polynomial after some large $\Delta t$. But the P - NP
question does not consider time as far as complexity is concerned: if we do not consider the
time dependent nature of complexity, one should conclude that P = NP. The next section
will show that it is not so clear.

III. A "Meta Mathematical" proof that P = NP is impossible to prove

A. The "P = NP" assertion is not equivalent to "Not P ≠ NP"

The previous time dependent argument is no longer valid with respect to P = NP, as we can
have $T_{M,t_0}(n) = T_{M,t_0+\Delta t}(n) = O(n^k)$ in this case. Indeed, from a strict mathematical point
of view, one should accept that P = NP as soon as P ≠ NP is proven to be impossible. But,
if we take into account the time dependence of the complexity measure $T_M(n)$, the assertion
"P = NP" does not mean solely the contrary of "P ≠ NP", even if both assertions are
mutually exclusive.

Indeed, "P = NP" can be rewritten as

$$T_{M,t}(n) = T_{M,t+\Delta t}(n) = O(n^k) \quad \forall t, \Delta t \text{ and for any problem } M \text{ in } NP \qquad (8)$$
The idea in this section is to show that for any NP problem $M$, there will be a time $t$
where $T_{M,t}(n) \gg O(n^k)$ [$T_{M,t}(n) = \Omega(2^n)$, for instance²]. Therefore, equation (8) will not
hold and the assertion "P = NP" will be false. The idea here is to point out some random
property related to a special class of 3-CNF formulae, the INS 3-CNF formulae, as we did
with the safe problem when we computed $T_{safe,\,t_0}(n)$.

B. The class of INS 3-CNF formulae 

Let us first introduce the notion of Irreducible Non Satisfiable (or INS) 3-CNF formulae.

Definition III.1: An INS 3-CNF formula is a non satisfiable 3-CNF formula $\varphi^*_{n,m}$ such
that any smaller sub-formula $\varphi_{k,l}$ ($k \le n$, $l < m$) of $\varphi^*_{n,m}$ is satisfiable. This means that
the non satisfiability nature of $\varphi^*_{n,m}$ requires the entire set of the $m$ clauses of $\varphi^*_{n,m}$.

The argument in the following section is to divide the 3-CNF-SAT problem into two separated
and "orthogonal" problems: the INS-3-CNF-SAT and the INS-Reduction problems.
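Definition III.1 can be turned into a small brute-force test. The sketch below is an illustration under assumptions (signed-integer literals, hypothetical function names). It relies on one elementary observation: removing clauses can only make a formula easier to satisfy, so it is enough to check that deleting any single clause yields a satisfiable formula.

```python
from itertools import product


def is_satisfiable(formula, n):
    """Brute-force satisfiability test over the 2**n assignments."""
    for values in product((False, True), repeat=n):
        if all(any(values[abs(l) - 1] == (l > 0) for l in c) for c in formula):
            return True
    return False


def is_ins(formula, n):
    """True if the formula is non satisfiable and every clause is necessary."""
    formula = list(formula)
    if is_satisfiable(formula, n):
        return False
    # Every sub-formula is contained in one obtained by deleting a single
    # clause, so testing these m deletions covers all smaller sub-formulae.
    return all(
        is_satisfiable(formula[:i] + formula[i + 1:], n)
        for i in range(len(formula))
    )


# The 8 clauses built on x_1, x_2, x_3 with all sign patterns.
CORE = [tuple(s * v for s, v in zip(signs, (1, 2, 3)))
        for signs in product((1, -1), repeat=3)]
```

The formula `CORE` is INS: each assignment of $(x_1, x_2, x_3)$ falsifies exactly one of its clauses. Adding any extra clause keeps it non satisfiable but destroys irreducibility.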

C. The "unprovability" of P = NP 

Lemma III.1: For some time $t + \delta t$, the 3-CNF-SAT problem is $\Omega(2^n)$, even if one can
solve the INS-3-CNF-SAT problem in $O(n^k)$.

Proof: The core of this proof is to concentrate our attention, not on the satisfiability
characteristic of $\varphi_{n,m}$, but on the non necessary clauses in $\varphi_{n,m}$.

1. Let us suppose that, for some time $t$, we have got enough time to build the set
$\mathcal{S}^{INS}_{n,m}$ of the INS 3-CNF formulae $\varphi^*_{n,m}$. As shown in equation (5), at time $t$, it takes
$O(n^k)$ computations to check whether or not a given formula $\varphi^*_{n,m}$ belongs to $\mathcal{S}^{INS}_{n,m}$,
as $\mathcal{S}^{INS}_{n,m} \subset \mathcal{S}_{n,m}$.

2. Let $\varphi^*_{n,m}$ be an INS 3-CNF formula in $\mathcal{S}^{INS}_{n,m}$. From $\varphi^*_{n,m}$, we generate a new
non satisfiable formula $\varphi_{n,2m}$ with $2m$ clauses, by adding randomly $m$ extra clauses.
These clauses can be considered as noisy extra clauses. This random generation is
over at time $t + \delta t$.

3. At time $t + \delta t$ (remember that we have knowledge of $\mathcal{S}^{INS}_{n,m}$ from time $t$), we want
to check whether or not $\varphi_{n,2m}$ belongs to $\mathcal{S}_{n,2m}$ [the general 3-CNF-SAT problem,
with no information about $\mathcal{S}_{n,2m}$].

Our 3-CNF-SAT algorithm on $\varphi_{n,2m}$ will use the information about $\mathcal{S}^{INS}_{n,m}$, as this
information is related to the most difficult part of the algorithm (the non satisfiability
property of a 3-CNF formula). Moreover, by hypothesis, at time $t + \delta t$, this
sub-algorithm is supposed to be polynomial for any INS 3-CNF formula.
So, the 3-CNF-SAT algorithm will have to find, inside the clauses of $\varphi_{n,2m}$, the
added or noisy clauses, so that it can find the hidden INS sub-formula $\varphi^*_{n,m}$ in
$\varphi_{n,2m}$. Let us call this search the INS-Reduction problem. We have thus divided the
3-CNF-SAT problem into two orthogonal problems: the INS-3-CNF-SAT problem
(in $O(n^k)$) and the INS-Reduction problem.

4. Let us now prove that, at time $t + \delta t$, the INS-Reduction problem is $\Omega(2^n)$ for our
3-CNF formula $\varphi_{n,2m}$. Once again, we use a meta mathematical argument, based
on some property of true randomness.

To the Irreducible Non Satisfiable formula $\varphi^*_{n,m}$, one can add any extra clause
without changing the non satisfiable nature of the obtained formula. These added clauses
can be selected in a totally arbitrary way with respect to $\varphi^*_{n,m}$ (except that all
clauses should be unique). So, one can add to $\varphi^*_{n,m}$ many different clauses, in a
random way, without link with $\varphi^*_{n,m}$. One possible random output of this generation
process can be our particular formula $\varphi_{n,2m}$.

In fact, $\varphi_{n,2m}$ can be seen as the final output at time $t + \delta t$ of a random process
beginning with $\varphi^*_{n,m}$ at time $t$. If we look at the process in a backward way, we
see that there are different possible random processes, beginning with different
$\varphi^*_{n,m}$, which lead to a particular $\varphi_{n,2m}$. Mathematically speaking, it is impossible to
distinguish the given formula $\varphi_{n,2m}$ from the result of a true random process. And
if $\varphi_{n,2m}$ is truly a random output, we are then in presence of a problem similar to
the safe problem in time $t_0$, when a random search process was needed to find the
solution. There is no way to get useful information for the search of $\varphi^*_{n,m}$ inside
$\varphi_{n,2m}$ [the INS-Reduction problem]. So, one has to check all possible combinations
for the sub-formula $\varphi^*_{n,m}$ and then see whether or not this sub-formula belongs to
$\mathcal{S}^{INS}_{n,m}$ (in $O(n^k)$).

And this INS-Reduction algorithm takes at least an exponential number of operations
(in fact³: $\sum_{p=1}^{m} C_{2m}^{p} = \Omega(2^m)$). Using the fact that $m = \Omega(n)$ (see equation (1)), the
3-CNF-SAT problem for any formula like $\varphi_{n,2m}$ is $\Omega(2^m \times n^k) = \Omega(2^m) = \Omega(2^n)$, at
least at time $t + \delta t$.

■

² $\Omega(2^n)$ means that the computation time is larger than $2^n$ (i.e. exponential).

Theorem III.2: Any 3-CNF-SAT algorithm should contain, at some time $t + \delta t$, a
sub-algorithm equivalent to the INS-Reduction algorithm, and therefore is $\Omega(2^n)$. So, it is
impossible to prove that P = NP, in the sense defined in equation (8).

Proof: The proof of this assertion is based on the very nature of the $2m$ clauses of $\varphi_{n,2m}$:
$m$ of them are mathematically related to the non satisfiability property of $\varphi_{n,2m}$, while the
other $m$ clauses are totally unrelated (as noise) to it. Any 3-CNF-SAT algorithm for such a
formula as $\varphi_{n,2m}$ should handle, in some way, these noisy extra clauses. And, as these
extra clauses can be anything (totally random), there is no way to escape some exponential
INS-Reduction (or random search) process to get rid of them. ■

Once again, the pseudo random nature of the NP problem arises in the reflection. It is
because of the possible randomness within the generation of the extra clauses (from $\varphi^*_{n,m}$ to
$\varphi_{n,2m}$) that there is no efficient or polynomial way to find back $\varphi^*_{n,m}$ inside $\varphi_{n,2m}$, and thus
3-CNF-SAT cannot be proved to be in P because of that.
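The exhaustive search that the argument above has in mind can be sketched as follows. This is only an illustration under assumptions: literals are signed integers, sub-formulae are enumerated by increasing number of clauses, and a brute-force INS test (`is_ins`, a hypothetical helper) stands in for the lookup in the precomputed set $\mathcal{S}^{INS}_{n,p}$. The sketch shows the cost of this particular enumeration; it does not establish that no faster method exists.

```python
from itertools import combinations, product


def is_satisfiable(formula, n):
    """Brute-force satisfiability test over the 2**n assignments."""
    for values in product((False, True), repeat=n):
        if all(any(values[abs(l) - 1] == (l > 0) for l in c) for c in formula):
            return True
    return False


def is_ins(formula, n):
    """Stand-in oracle for membership in the set of INS formulae."""
    formula = list(formula)
    if is_satisfiable(formula, n):
        return False
    return all(is_satisfiable(formula[:i] + formula[i + 1:], n)
               for i in range(len(formula)))


def ins_reduction(formula, n, ins_oracle=is_ins):
    """Return (hidden INS sub-formula, number of sub-formulae examined)."""
    examined = 0
    for p in range(1, len(formula) + 1):       # sub-formulae with p clauses
        for sub in combinations(formula, p):   # C(len(formula), p) candidates
            examined += 1
            if ins_oracle(sub, n):
                return list(sub), examined
    return None, examined
```

On a toy input made of the 8 sign patterns over $x_1, x_2, x_3$ plus two noisy clauses, the enumeration has to go through every sub-formula of fewer than 8 clauses before reaching the hidden core, which is where the $\sum_p C_{2m}^{p}$ count comes from.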



IV. Conclusions 



This paper tries to show that the P versus NP problem is impossible to solve within the time
independent framework of Mathematics, as neither P = NP nor P ≠ NP can be proved
without reference to time. The key concept of the paper is the temporal nature of the
complexity measure for the NP-hard problems. This time dependence is closely related to
some (pseudo) randomness in the heart of these problems. Some analogy can be found with
Chaos theory, where pseudo randomness arises from deterministic processes.

For the author, NP is really different from P, but the difference lies in the distinction
between true randomness and mathematical pseudo-randomness, and this frontier is situated
on the limit border of Mathematics (which is deterministic).

³ See Appendix for a proof.

The impossibility for a solution to the P versus NP problem gives a new perspective on the pseudo non
deterministic (or random) nature of the most difficult problems, the NP-hard problems:
we can see these problems as so inextricable that we are in front of them like someone
facing some random search problem (as the safe problem), even if they are deterministic (not
random) in their very essential nature, i.e. as quasi chaotic problems.

Therefore, the P - NP "unprovability" can be seen as the expression of the incapacity for
Mathematics to give a time independent definition of randomness.



V. Appendix : Details about the exponential complexity of the 
INS-Reduction Process 

A. Preliminaries 

Let $\varphi^*_{n,2m}$ be the 3-CNF formula to be reduced, and $\varphi_{n,p}$ be any sub-formula of $\varphi^*_{n,2m}$. We
suppose that $\varphi^*_{n,2m}$ is a random extension of some $\varphi_{n,m}$ in $\mathcal{S}^{INS}_{n,m}$, where $\mathcal{S}^{INS}_{n,p}$ denotes the
set of all Irreducible Non Satisfiable 3-CNF formulae $\varphi_{n,p}$ of dimension $(n,p)$. These sets
are supposed to be known here.

The INS-Reduction Process checks whether there exists $\varphi_{n,p}$ in $\mathcal{S}^{INS}_{n,p}$, for some $p \le 2m$, such
that $\varphi_{n,p}$ is a sub-formula of $\varphi^*_{n,2m}$ and $\varphi_{n,p}$ is Irreducible Non Satisfiable. We will prove that
this process has an exponential complexity:

$$T_{INS\text{-}Reduction,\,t+\delta t}(n) = \Omega(2^n).$$



B. The two approaches for the INS-Reduction Process 

In fact, there are only two major ways to check whether or not there exists a sub-formula
of $\varphi^*_{n,2m}$ in $\mathcal{S}^{INS}_{n,p}$ ($p \le 2m$). Any INS-Reduction algorithm will be a mixture of these two
approaches:

1. From $\varphi^*_{n,2m}$ to $\mathcal{S}^{INS}_{n,p}$: one considers all the possible sub-formulae of $\varphi^*_{n,2m}$
with dimension $(n,p)$ ($p \le 2m$), and then checks whether these sub-formulae belong
to $\mathcal{S}^{INS}_{n,p}$; one stops as soon as such a sub-formula is found. By hypothesis, the
algorithm will stop with $p = m$ as $\varphi^*_{n,2m}$ is an extension of some $\varphi_{n,m} \in \mathcal{S}^{INS}_{n,m}$.

2. From $\mathcal{S}^{INS}_{n,p}$ to $\varphi^*_{n,2m}$: for each formula $\varphi_{n,p}$ in $\mathcal{S}^{INS}_{n,p}$ ($p \le 2m$), one checks
whether $\varphi_{n,p}$ is a sub-formula of $\varphi^*_{n,2m}$; one stops as soon as such a sub-formula is
found. Here again, $p = m$ at the end of the process.

C. Complexity of both approaches 

1. Because of the pseudo random nature of $\varphi^*_{n,2m}$, the first algorithm is required
to consider all the sub-formulae of $\varphi^*_{n,2m}$ of dimension $(n,p)$ ($p \le 2m$). As $\varphi^*_{n,2m}$ is
an extension of some $\varphi_{n,m}$, the algorithm will consider $\sum_{p=1}^{m} C_{2m}^{p} = \Omega(2^m)$
different sub-formulae. For each of these sub-formulae, it takes $O(n^k)$ operations (see
equation (5)) to check whether or not it belongs to $\mathcal{S}^{INS}_{n,p}$. So, the first algorithm is
$\Omega(2^m) \times O(n^k) = \Omega(2^m) = \Omega(2^n)$.

2. Because of the pseudo random nature of $\varphi^*_{n,2m}$, the second algorithm is required
to consider all the formulae belonging to $\mathcal{S}^{INS}_{n,p}$ ($p \le 2m$). As $\varphi^*_{n,2m}$ is an extension of
some $\varphi_{n,m}$, the algorithm will consider $\sum_{p=1}^{m} \#\{\mathcal{S}^{INS}_{n,p}\}$ different INS 3-CNF
formulae. For each of these formulae, it takes $O(n^k)$ operations to check whether or not
one gets a sub-formula of $\varphi^*_{n,2m}$. This is just a classical string searching algorithm,
which has polynomial complexity. So, the complexity of the second algorithm will
be $\sum_{p=1}^{m} \#\{\mathcal{S}^{INS}_{n,p}\} \times O(n^k)$.

By proving in the next section that $\sum_{p=1}^{m} \#\{\mathcal{S}^{INS}_{n,p}\}$ is $\Omega(2^n)$, we show that both approaches
for the INS-Reduction process are equivalent in terms of complexity. And this holds for any
mixture of these approaches.

■



D. Theorem: $\sum_{p=1}^{m} \#\{\mathcal{S}^{INS}_{n,p}\} = \Omega(2^n)$ for $m \ge m_{max} = \dfrac{2^n C_n^3}{2^{n-3} + C_n^3 - 1}$

D.1 Notations

Let $\varphi_{n,m} \in \Phi_{n,m}$ be a 3-CNF formula with propositional variables $x_1, \cdots, x_n$ and clauses
$\psi_1, \cdots, \psi_m$. Let $\Psi_n$ be the set of the $2^3 \times C_n^3$ possible clauses with $n$ variables, and $\{0,1\}^n$
be the set of all possible logical values for the variables.

Let $g : \Psi_n \to \{0,1\}^n : g(\psi_j) = S_j \subset \{0,1\}^n$ where $S_j = \{v \in \{0,1\}^n : \psi_j(v) = 0\}$. For
instance, $\psi_j = x_1 \vee x_2 \vee x_4$ ($\in \Psi_4$) leads to $g(\psi_j) = S_j = \{(0,0,0,0), (0,0,1,0)\}$. See Table
II where $v = (a,b,c,d)$ corresponds to the column $i = a + b \cdot 2 + c \cdot 2^2 + d \cdot 2^3$. It is clear that
$\#S_j = 2^{n-3}$.

Let us define $g^{-1}(v \mid \varphi_{n,m}) = \{\psi_j : \psi_j(v) = 0,\ \psi_j \text{ in } \varphi_{n,m}\}$. We have that
$\#\{g^{-1}(v \mid \varphi_{n,m})\} \le C_n^3$, as each $v$ can correspond to a maximum of $C_n^3$ clauses.
TABLE II

Graph of $g$: Value 1 in the $i$-th column corresponds to $v \in S_j$, where $v = (a,b,c,d)$ with
$a + b \cdot 2 + c \cdot 2^2 + d \cdot 2^3 = i$. The box is the first pivot $v_1$ and the underlined elements are the discarded values
for the next pivot $v_2$ (see section (D.3)).



D.2 Sufficient and necessary conditions for non satisfiability

Theorem V.1: A 3-CNF formula $\varphi_{n,m} = \bigwedge_{j=1}^{m} \psi_j$ is not satisfiable iff

$$\bigcup_{j=1}^{m} g(\psi_j) = \bigcup_{j=1}^{m} S_j = \{0,1\}^n.$$

Proof:

$$\varphi_{n,m} \text{ is not satisfiable} \iff \forall v \in \{0,1\}^n : \varphi_{n,m}(v) = 0$$
$$\iff \forall v \in \{0,1\}^n : \exists j : \psi_j(v) = 0$$
$$\iff \forall v \in \{0,1\}^n : \exists j : v \in S_j$$
$$\iff \bigcup_{j=1}^{m} S_j = \{0,1\}^n$$

Corollary V.1: As $S_j \subset \{0,1\}^n$, a 3-CNF formula $\varphi_{n,m}$ is not satisfiable iff

$$\#\Big\{\bigcup_{j=1}^{m} S_j\Big\} = 2^n.$$

Looking at Table II, it is easy to verify whether some formula $\varphi_{4,m}$ is satisfiable or not; one
just has to check the existence of a "1" in each column, when one looks only at the lines
corresponding to the clauses $\psi_j$ of $\varphi_{4,m}$. For instance, it is clear that the first 8 clauses taken
together are non satisfiable.
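The covering criterion of Theorem V.1 is easy to check mechanically. The sketch below is an illustration under assumptions: literals are encoded as signed integers and an assignment $v$ is a tuple $(x_1, \ldots, x_n)$ of 0/1 values; the function names are hypothetical.

```python
from itertools import product


def falsifying_set(clause, n):
    """S_j = g(psi_j): the assignments v in {0,1}^n for which the clause is false."""
    return {
        v
        for v in product((0, 1), repeat=n)
        # A positive literal x_k is false when v_k = 0, a negated one when v_k = 1.
        if all(v[abs(l) - 1] == (0 if l > 0 else 1) for l in clause)
    }


def column_index(v):
    """Column of Table II for v = (a, b, c, d, ...): a + 2b + 4c + 8d + ..."""
    return sum(bit << position for position, bit in enumerate(v))


def is_unsat_by_cover(formula, n):
    """Theorem V.1: non satisfiable iff the sets S_j cover {0,1}^n."""
    covered = set().union(*(falsifying_set(c, n) for c in formula))
    return len(covered) == 2 ** n
```

For the clause $x_1 \vee x_2 \vee x_4$ with $n = 4$, `falsifying_set` returns the two assignments quoted in section D.1, which sit in columns 0 and 4 of Table II.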



Theorem V.2: A 3-CNF formula $\varphi_{n,m}$ is not satisfiable in the INS-3-CNF-SAT sense,
i.e. $\varphi_{n,m} \in \mathcal{S}^{INS}_{n,m}$, iff $\bigcup_j S_j = \{0,1\}^n$ and $\forall \psi_j, \exists v \in S_j$ such that $g^{-1}(v \mid \varphi_{n,m}) = \{\psi_j\}$.

Definition V.3: This value $v$ is called the pivot for $\psi_j$.

Proof: On the contrary, let us suppose that $\bigcup_j S_j = \{0,1\}^n$ but $\exists \psi_j$ such that $\forall v \in
S_j$, $g^{-1}(v \mid \varphi_{n,m}) \ne \{\psi_j\}$. Of course, $\psi_j \in g^{-1}(v \mid \varphi_{n,m})$ as $v \in S_j$. Thus,

$$\forall v \in S_j, \exists \psi_k \ne \psi_j \text{ such that } \psi_k \in g^{-1}(v \mid \varphi_{n,m})$$
$$\iff \forall v \in S_j, \exists k \ne j \text{ such that } v \in S_k$$
$$\iff \bigcup_{k=1,\,k \ne j}^{m} S_k = \bigcup_{k=1}^{m} S_k = \{0,1\}^n$$
$$\iff \varphi_{n,m} \text{ is not satisfiable, even when the clause } \psi_j \text{ is deleted}$$
$$\iff \varphi_{n,m} \text{ is not an Irreducible Non Satisfiable formula.}$$

From Table II, we see that a formula $\varphi_{4,m}$ is an INS formula if for each clause in $\varphi_{4,m}$ there
exists at least one column with only one "1" in it. For instance, it is clear that the first 9
clauses taken together are not Irreducible Non Satisfiable, as there exist two "1" in columns
11 and 15 for the 9th clause. This last clause can be considered as a noisy clause, as there
is no pivot for it.
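The pivot criterion of Theorem V.2 can be sketched in the same style. As before, this is an illustration under assumptions (signed-integer literals, distinct clauses, hypothetical function names): a pivot of a clause is an assignment falsified by that clause and by no other clause of the formula.

```python
from collections import Counter
from itertools import product


def falsifying_set(clause, n):
    """S_j: the assignments in {0,1}^n for which the clause is false."""
    return {
        v
        for v in product((0, 1), repeat=n)
        if all(v[abs(l) - 1] == (0 if l > 0 else 1) for l in clause)
    }


def pivot_sets(formula, n):
    """For each clause, the assignments falsified by this clause only."""
    sets = [falsifying_set(c, n) for c in formula]
    hits = Counter(v for s in sets for v in s)  # number of clauses falsified by v
    return [{v for v in s if hits[v] == 1} for s in sets]


def is_ins_by_pivots(formula, n):
    """Theorem V.2: the S_j cover {0,1}^n and every clause owns a pivot."""
    sets = [falsifying_set(c, n) for c in formula]
    covered = set().union(*sets)
    return len(covered) == 2 ** n and all(pivot_sets(formula, n))
```

In the terms of Table II, a clause without a pivot is a line whose every "1" shares its column with another "1"; such a clause is noise and can be deleted without restoring satisfiability.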

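Theorem V.4: Only formulae $\varphi_{n,m}$ with $8 \le m \le \dfrac{2^n C_n^3}{2^{n-3} + (C_n^3 - 1)} = m_{max}$ can be non
satisfiable in the INS-3-CNF-SAT sense.

Proof: • As $\#S_j = 2^{n-3}$:

$$\varphi_{n,m} \text{ is non satisfiable} \iff \bigcup_{j=1}^{m} S_j = \{0,1\}^n$$
$$\Rightarrow \sum_{j=1}^{m} \#S_j \ge 2^n = \#\{0,1\}^n$$
$$\Rightarrow m \cdot 2^{n-3} \ge 2^n$$
$$\Rightarrow m \ge 8$$

• As $\varphi_{n,m}$ is not satisfiable in the INS-3-CNF-SAT sense:

$$\varphi_{n,m} \text{ is not satisfiable} \iff \bigcup_j S_j = \{0,1\}^n \text{ and } \forall j\ \exists v_j \in S_j : g^{-1}(v_j) = \{\psi_j\}$$
$$\Rightarrow \#\{g^{-1}(v \mid \varphi_{n,m})\} = 1 \text{ if } v \text{ is a pivot, and } \#\{g^{-1}(v \mid \varphi_{n,m})\} \le C_n^3 \text{ otherwise}$$
$$\Rightarrow \sum_{v \in \{0,1\}^n} \#\{g^{-1}(v \mid \varphi_{n,m})\} \le m + C_n^3\,(2^n - m)$$

[$\sum_{v \in \{0,1\}^n} \#\{g^{-1}(v \mid \varphi_{n,m})\}$ is the total number of relations between the $m$ clauses and $\{0,1\}^n$.]
But we know that this total number of relations is also equal to

$$\sum_{j=1}^{m} \#S_j = \sum_{j=1}^{m} 2^{n-3} = m \times 2^{n-3}$$

So, we have:

$$m \times 2^{n-3} \le m + C_n^3\,(2^n - m) \quad \Rightarrow \quad m \le \frac{2^n C_n^3}{2^{n-3} + C_n^3 - 1} = m_{max}$$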
Theorem:

$$\sum_{p=1}^{m} \#\{\mathcal{S}^{INS}_{n,p}\} \ \ge\ \frac{\prod_{i=0}^{4} \big[2^n C_n^3 - i\{2^{n-3} C_n^3 + (2^{n-3} - 1)(C_n^3 - 1)\}\big]}{\prod_{i=0}^{4} (m_{max} - i)} = \Omega(2^n) \qquad \Big(\text{for } m \ge m_{max} = \frac{2^n C_n^3}{2^{n-3} + C_n^3 - 1}\Big)$$

Proof: The idea is to build the set of all INS-3-CNF formulae from the graph of $g$, by
choosing recursively the pivots for these formulae. Indeed, any INS-3-CNF formula $\varphi_{n,m}$ is
characterized by $(v_1, \cdots, v_m)$, where $v_j$ is the pivot for the clause $\psi_j$. We will get $\sum_{p=1}^{m} \#\{\mathcal{S}^{INS}_{n,p}\}$
by counting the number of possible choices for $(v_1, \cdots, v_m)$, with $8 \le m \le m_{max}$. Remember
that $\forall v_j : g^{-1}(v_j \mid \varphi_{n,m}) = \{\psi_j\}$ (see definition V.3).

• The first step is to choose a clause $\psi_1$ from the $2^3 C_n^3$ possible clauses, and then a pivot
$v_1$ among the $2^{n-3}$ elements associated to $\psi_1$ (see Table II for an example of pivot). So,
there are $2^n C_n^3$ possible choices for the first pivot.

• For the choice of the second clause $\psi_2$ and pivot $v_2$, we have to discard those clauses $\psi$ such
that $\psi \in g^{-1}(v_1 \mid \varphi_{n,1} = \psi_1)$, i.e. those clauses with a "1" in the column of $v_1$ in the table.
$C_n^3$ clauses (and the $2^{n-3}$ corresponding table elements) should be discarded at that stage,
otherwise $g^{-1}(v_1 \mid \varphi_{n,2}) \ne \{\psi_1\}$. Looking at the other elements $v$ in $S_1$ (the line corresponding
to $\psi_1$), we have to reject, as future candidates for the next pivot, the elements of the table
in the columns of these $v \in S_1$. We should discard $(2^{n-3} - 1)(C_n^3 - 1)$ elements, i.e. the
number of elements in $S_1$ different from $v_1$ times the number of non null elements in each
column, not in $S_1$. Indeed, if we take such an element as our next pivot $v_2$, we will get
$g^{-1}(v_2 \mid \varphi_{n,2} = \psi_1 \wedge \psi_2) = \{\psi_1, \psi_2\} \ne \{\psi_2\}$ and that is contrary to the definition of a pivot.

In Table II, this corresponds to the 3 underlined "1" in column 13, the third line $S_1$ not
being taken into account. In summary, there are $2^n C_n^3 - \{2^{n-3} C_n^3 + (2^{n-3} - 1)(C_n^3 - 1)\}$
possibilities for the second pivot $v_2$. In Table II, this means "32 - 11" possibilities.
• The third ($v_3$) and following steps are similar. We discard the same number of elements
from the table at each stage, or less if we choose a pivot such that some redundancy appears
with previous deletions. From any previous choice of pivots $(v_1, v_2, v_3, \cdots)$, it is always
possible to build at least one INS-3-CNF formula $\varphi_{n,m}$ with $8 \le m \le m_{max}$: as soon
as there is no possible choice for the next pivot, this means that we have an INS-3-CNF
formula. We then put the value "1" for the following terms in our product, so that the total
number of possibilities remains unchanged. This is done by introducing the following term
$\max\{1, [2^n C_n^3 - i\{2^{n-3} C_n^3 + (2^{n-3} - 1)(C_n^3 - 1)\}]\}$ in our overall product.

• Let us note that for large $n$, the term $[2^n C_n^3 - i\{2^{n-3} C_n^3 + (2^{n-3} - 1)(C_n^3 - 1)\}]$, which
corresponds to the minimum value for the number of possibilities at stage $i$, becomes negative
for $i \ge 5$, so we get that $\max\{1, [2^n C_n^3 - i\{2^{n-3} C_n^3 + (2^{n-3} - 1)(C_n^3 - 1)\}]\} = 1\ \forall i \ge 5$. We
can thus limit our product to $i = 4$. Indeed, the overall product

$$\prod_{i \ge 0} \max\{1, [2^n C_n^3 - i\{2^{n-3} C_n^3 + (2^{n-3} - 1)(C_n^3 - 1)\}]\} = \prod_{i=0}^{4} \big[2^n C_n^3 - i\{2^{n-3} C_n^3 + (2^{n-3} - 1)(C_n^3 - 1)\}\big]$$

corresponds to a minimum value for the number of possible ways for choosing the five first
pivots in the building of our INS-3-CNF formulae.

• We have now the pivots $v_i$ and their corresponding clauses $\psi_i$, such that
$g^{-1}(v_i \mid \varphi_{n,m}) = \{\psi_i\}$. The number $m$ of the so-selected pivots will depend on the
selection, with $8 \le m \le m_{max}$. For each choice of $(v_0, \cdots, v_4)$, one can build an INS-3-CNF
formula in $(m - 5)!$ ways, depending on the ordering of $(v_5, \cdots, v_m)$. So, we get that
$\prod_{i=0}^{4} [2^n C_n^3 - i\{2^{n-3} C_n^3 + (2^{n-3} - 1)(C_n^3 - 1)\}] \times (m - 5)!$ is a minimum value for the number
of possible choices for the $m$ pivots. Let us remark that these $m$ pivots, as well as their
corresponding $m$ clauses, can be selected in $m!$ different orders, as all these ordered selections
are equivalent in terms of Irreducible Non Satisfiability. We should therefore retrieve the
ordering by dividing by $m!$.
• Putting all together, we have

$$\prod_{i=0}^{4} \big[2^n C_n^3 - i\{2^{n-3} C_n^3 + (2^{n-3} - 1)(C_n^3 - 1)\}\big] \times \frac{(m - 5)!}{m!}$$

possible different INS-3-CNF formulae built with our $m$ pivots. As the term $(m - 5)!/m!$
depends on the selected pivots, we replace it by a lower bound:

$$\frac{(m - 5)!}{m!} \ \ge\ \frac{(m_{max} - 5)!}{(m_{max})!} = \frac{1}{\prod_{i=0}^{4} (m_{max} - i)}$$

Finally, we get:

$$\sum_{p=1}^{m} \#\{\mathcal{S}^{INS}_{n,p}\} = \sum_{p=8}^{m} \#\{\mathcal{S}^{INS}_{n,p}\} \ \ge\ \frac{\prod_{i=0}^{4} \big[2^n C_n^3 - i\{2^{n-3} C_n^3 + (2^{n-3} - 1)(C_n^3 - 1)\}\big]}{\prod_{i=0}^{4} (m_{max} - i)} = \Omega(2^n) \quad \text{for } m \ge m_{max} = \frac{2^n C_n^3}{2^{n-3} + C_n^3 - 1}$$

■